Genetics Selection Evolution — Latest Matching Preprints

1

Impact of genomic preselection on subsequent genetic evaluations with ssGBLUP - using real data from pigs

Jibrila, I.; Vandenplas, J.; ten Napel, J.; Bergsma, R.; Veerkamp, R. F.; Calus, M. P. L.

2021-06-19 genetics 10.1101/2021.06.18.449002 medRxiv

Top 0.1%

89.9%

Background: Empirically assessing the impact of preselection on subsequent genetic evaluations of preselected animals requires comparing scenarios with different preselection approaches, including scenarios without preselection. However, preselection almost always takes place in animal breeding programs, so it is difficult to obtain a dataset without preselection. Hence most studies on preselection used simulated datasets, concluding that genomic estimated breeding values (GEBV) from subsequent single-step genomic best linear unbiased prediction (ssGBLUP) evaluations are unbiased. The aim of this study was to investigate the impact of genomic preselection (GPS) on accuracy and bias in subsequent ssGBLUP evaluations, using data from a commercial pig breeding program. Methods: We used data on four pig production traits from one sire line and one dam line. The traits are average daily gain during performance testing, average daily gain throughout life, backfat thickness, and loin depth. As these traits had different weights in the breeding goals of the two lines, we analyzed the two lines separately. Per line, we had a reference GPS scenario which kept all available data, against which the next two scenarios were compared. We then implemented two other scenarios with additional layers of GPS by removing all animals without progeny either i) only in the validation generation, or ii) in all generations. We conducted subsequent ssGBLUP evaluations per GPS scenario, utilizing all the data remaining after implementing the GPS scenario. In computing accuracy and bias, we compared GEBV against progeny yield deviations of validation animals. Results: Results for all traits in both lines showed marginal loss in accuracy due to the additional layers of GPS. Average accuracy across all GPS scenarios in both lines was 0.39, 0.47, 0.56, and 0.60, respectively, for the four traits considered in this study. Bias was largely absent, and when present did not differ greatly among corresponding GPS scenarios. Conclusion: As preselection generally has the same effect in animal breeding programs, we concluded that the impact of preselection is generally minimal on accuracy and bias in subsequent ssGBLUP evaluations of selection candidates in pigs and in other animal breeding programs.
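
The abstract states that accuracy and bias were computed by comparing GEBV against progeny yield deviations (PYD) of validation animals, but it does not give the formulas. A minimal sketch of one common way to do this is shown below, assuming accuracy is taken as the Pearson correlation between GEBV and PYD and bias is read from the slope of the regression of PYD on GEBV. The function name `validation_metrics` is hypothetical, and the slope expected under no bias depends on how the PYD are scaled, which the abstract does not specify.

```python
import numpy as np

def validation_metrics(gebv, pyd):
    """Return (accuracy, slope) for a set of validation animals.

    Assumptions (not taken from the abstract):
    - accuracy is the Pearson correlation between GEBV and PYD;
    - bias is assessed from the slope of the regression of PYD on GEBV.
    """
    gebv = np.asarray(gebv, dtype=float)
    pyd = np.asarray(pyd, dtype=float)
    # Pearson correlation between predictions and the validation target
    accuracy = np.corrcoef(gebv, pyd)[0, 1]
    # Ordinary least-squares slope: cov(PYD, GEBV) / var(GEBV)
    slope = np.cov(pyd, gebv)[0, 1] / np.var(gebv, ddof=1)
    return accuracy, slope

# Toy data only: PYD equal to exactly half of each GEBV
acc, slope = validation_metrics([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
```

In this toy input the two vectors are perfectly linearly related, so the correlation is 1 and the slope is 0.5; real validation data would give lower correlations and slopes that deviate from their expectation when predictions are over- or under-dispersed.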

2

Population history of Swedish cattle breeds: estimates and model checking

Adepoju, D.; Ohlsson, J. I.; Klingström, T.; Rius-Vilarrasa, E.; Johansson, A. M.; Johnsson, M.

2024-10-04 genetics 10.1101/2024.10.03.616479 medRxiv

Top 0.1%

79.6%

In this work, we use linkage disequilibrium-based methods to estimate recent population history from genotype data in Swedish cattle breeds, as well as international Holstein and Jersey cattle data for comparison. Our results suggest that these breeds have been effectively large up until recently, when they declined around the onset of systematic breeding. The inferred trajectories were qualitatively similar, with a large historical population and one decline. We used population genetic simulation to check the inferences. When comparing simulations from the inferred population histories to real data, the proportion of low-frequency variants in real data differed from that implied by the inferred population histories, and there was somewhat higher genomic inbreeding in real data than implied by the inferred histories. The inferred population histories imply that much of the variation we see today is transient, and it will be lost as the populations settle into a new equilibrium, even if efforts to maintain effective population size at current levels are successful.

3

Accounting for nuclear and mito genome in dairy cattle breeding - a simulation study

Mafra Fortuna, G.; Zumbach, B. J.; Johnsson, M.; Pocrnic, I.; Gorjanc, G.

2023-11-21 genetics 10.1101/2023.11.20.567907 medRxiv

Top 0.1%

74.9%

Mitochondria play a significant role in numerous cellular processes through proteins encoded by both nuclear genome (nDNA) and mito genome (mDNA). While the variation in nDNA is influenced by mutations and recombination of parental genomes, the variation in mDNA is solely driven by mutations. In addition, mDNA is inherited in a haploid form, from the dam. Cattle populations show significant variation in mDNA between and within breeds. Past research suggests that variation in mDNA accounts for 1-5% of the phenotypic variation in dairy traits. Here we simulated a dairy cattle breeding program to assess the impact of accounting for mDNA variation in pedigree-based and genome-based genetic evaluations on the accuracy of estimated breeding values for mDNA and nDNA components. We also examined the impact of alternative definitions of breeding values on genetic gain, including nDNA and mDNA components that both impact phenotype expression, but mDNA is inherited only maternally. We found that accounting for mDNA variation increased accuracy between +0.01 and +0.05 for different categories of animals, especially for young bulls (+0.05) and females without genotype data (between +0.01 and +0.03). Different scenarios of modelling and breeding value definition impacted genetic gain. The standard approach of ignoring mDNA variation achieved competitive genetic gain. Modelling, but not selecting on mDNA expectedly reduced genetic gain, while optimal use of mDNA variation recovered the genetic gain.

4

Evaluating genotyping strategies for a small managed population with simulation

Martin, A. A. A.; Schoenebeck, J.; Clements, D. N.; Lewis, T.; Wiener, P.; Gorjanc, G.

2025-01-25 genomics 10.1101/2025.01.23.634495 medRxiv

Top 0.1%

71.8%

Background: Collecting genomic information is crucial to advance breeding for complex traits such as health, welfare, and behaviour in domesticated populations. For that purpose, different data collection scenarios can be envisioned based on the number of individuals, the number of markers, and the genotyping technology. This study developed a simulation framework, based on a service dog population, aiming to identify an optimal and cost-effective genotyping strategy that would support the implementation of genomic selection, the investigation of the genetic architecture of traits of interest, and the tracking of loci of interest. Methods: We simulated a population based on the existing pedigree, using the gene drop method in AlphaSimR. The existing pedigree was extended with additional progeny generations to evaluate the outcomes of different genotyping strategies in the future. We generated genotype data based on existing high-coverage whole-genome sequences (WGS) for the current breeding dogs and evaluated different scenarios for genotyping the progeny. The genotyping options considered SNP arrays of various densities and WGS callsets produced from different sequencing depths. We then phased and imputed the genotype data to high-coverage WGS using AlphaPeel. Results: All scenarios were compared based on individual imputation accuracy against the simulated true whole-genome genotype. Averaged over five generations of simulated progeny, low-pass sequencing (0.5 to 2X depth) achieved accuracies of 0.998 to 0.999. The accuracy of SNP array genotyping (25K to 710K markers) was lower, with means of 0.911 to 0.938. Conclusions: Our simulation was tailored to identify the most cost-effective and efficient strategy for downstream use in genomic selection and genetic research into traits and loci of interest. As expected, low-pass sequencing outperformed SNP array genotyping in imputation accuracy of whole-genome genotypes. Additionally, low-pass sequencing technology was the most affordable genotyping approach currently available for dogs. Thus, it appears to be the optimal choice for balancing the goals of regimented breeding programmes such as those that produce service dogs. This simulation framework could also be adapted to address other objectives for breeding organisations working with small populations.

5

Across-breed analyses of genome-wide association studies for stature and mammary gland morphology in cattle reveal pleiotropic effects of the Friesian POLLED haplotype

Watson, N.; He, Q.; Kadri, N.; Leonard, A. S.; Seefried, F. R.; Pausch, H.

2025-10-08 genetics 10.1101/2025.10.08.681232 medRxiv

Top 0.1%

70.6%

Background: Genome-wide association studies (GWAS) in cattle populations have traditionally relied on progeny-derived phenotypes such as estimated breeding values (EBVs) as input phenotypes to identify additive quantitative trait loci (QTL) for complex traits. Increasing availability of cow genotype data now enables GWAS using own performance records to detect both additive and non-additive QTL. Results: Sequence-variant genotypes were imputed for 57,863 cows from the Holstein, Brown Swiss, Original Braunvieh, and Simmental cattle populations that had own performance records for stature and three mammary gland morphology traits (fore udder position, central ligament, front teat position). Genomic heritability ranged from 0.25 to 0.33 for fore udder position, 0.27 to 0.43 for udder central ligament, 0.49 to 0.59 for front teat position and 0.61 to 0.73 for stature. Additive genetic effects explained most of the SNP-based heritability for all traits and breeds. Within-breed genome-wide association studies identified 118 additive and 29 non-additive QTL for the four traits. Non-additive associations were only detected for stature. Although the majority of lead variants were in non-coding regions, we prioritized four missense variants in HMGA2 (rs385670251), ZBTB20 (rs470925681), ARSI (rs447362502) and CRAMP1 (rs445465383) as plausible causal variants for four stature QTL. Meta-analysis of the additive GWAS identified 63 mammary gland morphology and 43 stature QTL. A Holstein-specific mammary gland morphology QTL (Chr1:2748715, p=7.59e-35) colocalized with the POLLED locus on chromosome 1. Fine-mapping of this region revealed undesired effects of the Friesian POLLED haplotype on mammary gland morphology traits. Conclusion: Direct phenotypes for a large cohort of genotyped cows provide high statistical power for additive and non-additive association testing. Sequence-based association studies revealed QTL and candidate causal variants for stature and mammary gland morphology traits. Pleiotropic effects of the Friesian POLLED haplotype highlight the need for careful monitoring of potential unintended consequences when selecting for polledness in cattle.

6

Discovering genomic regions associated with the phenotypic differentiation of European local pig breeds

Poklukar, K.; Mestre, C.; Skrlep, M.; Candek-Potokar, M.; Ovilo, C.; Fontanesi, L.; Riquet, J.; Bovo, S.; Schiavo, G.; Ribani, A.; Munoz, M.; Bozzi, R.; Charneca, R.; Quintanilla, R.; Kusec, G.; Mercat, M.-J.; Zimmer, C.; Razmaite, V.; Araujo, J. P.; Radovic, C.; Karolyi, D.; Servin, B.

2022-02-22 genetics 10.1101/2022.02.22.481248 medRxiv

Top 0.1%

69.8%

Background: Intensive selection of modern pig breeds resulted in genetic improvement of productive traits while local pig breeds remained less performant. As they have been bred in extensive systems, they have adapted to specific environmental conditions, resulting in a rich genotypic and phenotypic diversity. This study is based on European local pig breeds genetically characterized using DNA-pool sequencing data and phenotypically characterized using breed-level phenotypes related to stature, fatness, growth and reproductive performance traits. These data were analyzed using a dedicated approach to detect selection signatures linked to phenotypic traits in order to uncover potential candidate genes that may be under adaptation to specific environments. Results: Genetic data analysis of European pig breeds revealed four main axes of genetic variation represented by Iberian and modern breeds (i.e. Large White, Landrace, and Duroc). In addition, breeds clustered according to their geographical origin, for example French Gascon and Basque breeds, Italian Apulo Calabrese and Casertana breeds, Spanish Iberian and Portuguese Alentejano breeds. Principal component analysis of phenotypic data distinguished between larger and leaner breeds with better growth potential and reproductive performance on one hand and breeds that were smaller, fatter, and had low growth and reproductive efficiency on the other hand. Linking selection signatures with phenotype identified 16 significant genomic regions associated with stature, 24 with fatness, 2 with growth and 192 with reproduction. Among them, several regions contained candidate genes with possible biological effect on stature, fatness, growth and reproduction performance traits. For example, strong associations were found for stature in two regions containing the ANXA4 and ANTXR1 genes, for fatness in regions containing the DNMT3A and POMC genes and for reproductive performance in a region containing the HSD17B7 gene. Conclusions: The present study on European local pig breeds used a dedicated approach for searching selection signatures supported by phenotypic data at the breed level to identify potential candidate genes that may have adapted to different living environments and production systems. Results can be useful to define conservation programs of local pig breeds.

7

Genomic data enables genetic evaluation using data recorded on LMIC smallholder dairy farms

Powell, O. M.; Mrode, R.; Gaynor, R. C.; Johnsson, M.; Gorjanc, G. M.; Hickey, J. M.

2019-11-02 genetics 10.1101/827956 medRxiv

Top 0.1%

66.6%

Background: Genetic evaluation is a central component of a breeding program. In advanced economies, most genetic evaluations depend on large quantities of data that are recorded on commercial farms. Large herd sizes and widespread use of artificial insemination create strong genetic connectedness that enables the genetic and environmental effects of an individual animal's phenotype to be accurately separated. In contrast to this, herds are neither large nor have strong genetic connectedness in smallholder dairy production systems of many low to middle-income countries (LMIC). This limits genetic evaluation, and furthermore, the pedigree information needed for traditional genetic evaluation is typically unavailable. Genomic information keeps track of shared haplotypes rather than shared relatives. This information could capture and strengthen genetic connectedness between herds and through this may enable genetic evaluations for LMIC smallholder dairy farms. The objective of this study was to use simulation to quantify the power of genomic information to enable genetic evaluation under such conditions. Results: The results from this study show: (i) the genetic evaluation of phenotyped cows using genomic information had higher accuracy compared to pedigree information across all breeding designs; (ii) the genetic evaluation of phenotyped cows with genomic information and modelling herd as a random effect had higher or equal accuracy compared to modelling herd as a fixed effect; (iii) the genetic evaluation of phenotyped cows from breeding designs with strong genetic connectedness had higher accuracy compared to breeding designs with weaker genetic connectedness; (iv) genomic prediction of young bulls was possible using marker estimates from the genetic evaluations of their phenotyped dams. For example, the accuracy of genomic prediction of young bulls from an average herd size of 1 (=1.58) was 0.40 under a breeding design with 1,000 sires mated per generation and a training set of 8,000 phenotyped and genotyped cows. Conclusions: This study demonstrates the potential of genomic information to be an enabling technology in LMIC smallholder dairy production systems by facilitating genetic evaluations with in-situ records collected from farms with herd sizes of four cows or less. Across a range of breeding designs, genomic data enabled accurate genetic evaluation of phenotyped cows and genomic prediction of young bulls using data sets that contained small herds with weak genetic connections. The use of smallholder dairy data in genetic evaluations would enable the establishment of breeding programs to improve in-situ germplasm and, if required, would enable the importation of the most suitable external germplasm. This could be individually tailored for each target environment. Together this would increase the productivity, profitability and sustainability of LMIC smallholder dairy production systems. However, data collection, including genomic data, is expensive and business models will need to be carefully constructed so that the costs are sustainably offset.

8

Application of a French cattle pangenome, from structural variant discovery to association studies on key phenotypes

Sorin, V.; Naji, M.-M.; Birbes, C.; Grohs, C.; Escouflaire, C.; Fritz, S.; Eche, C.; Marcuzzo, C.; Suin, A.; Donnadieu, C.; Gaspin, C.; Iampietro, C.; Milan, D.; Drouilhet, L.; Tosser-Klopp, G.; Boichard, D.; Klopp, C.; Sanchez, M.-P.; Boussaha, M.

2025-04-18 genetics 10.1101/2025.04.15.648672 medRxiv

Top 0.1%

65.0%

Background: The current cattle reference genome assembly, a pseudo-linear sequence produced using sequences from a single Hereford cow, represents a limitation when performing genetic studies, especially when investigating the whole spectrum of genetic variations within the species. Detecting structural variations (SVs) poses significant challenges when relying solely on conventional methods of short or long-read sequence mapping to the current bovine genome assembly. Results: In this study, we used long reads (LR) and bioinformatic tools to construct a comprehensive bovine pangenome incorporating genetic diversity of 64 good-quality de novo genome assemblies representing 14 French dairy and beef cattle breeds. Using a combination of complementary approaches, we explored the pangenome graph and identified 2.563 Gb of sequences common to all samples, and a cumulative 0.295 Gb of variable sequences. Notably, we discovered 0.159 Gb of novel sequences not present in the current Hereford reference genome assembly. Our analysis also revealed 109,275 SVs, of which 84,612 were bi-allelic, including 21,840 insertions and 21,340 deletions. Genome-wide association studies using SNPs and a panel of 221 SVs, shared between the pangenome and the EuroGMD chip, revealed several well-known QTLs across the genome for the Holstein, Montbeliarde and Normande breeds. Among those, a QTL on chromosome 11 presents an SV with a highly significant effect on stature in the Holstein breed. This SV is a 6.2 kb deletion affecting the 5' UTR, first exon and part of the first intron of the MATN3 gene, suggesting a potential regulatory and coding effect. Conclusions: Our study provides new insights into the genetic diversity of 14 French dairy and beef breeds and highlights the utility of pangenome graphs in capturing structural variation. The identified SV associated with stature highlights the importance of integrating SVs into GWAS for a more comprehensive understanding of complex traits.

9

Comparison of breeding strategies for the creation of a synthetic pig line

Ganteil, A.; Pook, T.; Rodriguez-Ramilo, S. T.; Ligonesche, B.; Larzul, C.

2021-09-24 genetics 10.1101/2021.09.22.461330 medRxiv

Top 0.1%

61.5%

Creating a new synthetic line by crossbreeding means complementary traits from pure breeds can be combined in the new population. Although diversity is generated during the crossbreeding stage, in this study, we analyze diversity management before selection starts. Using genomic and phenotypic data from animals belonging to the first generation (G0) of a new line, different simulations were run to evaluate diversity management during the first generations of a new line and to test the effects of starting selection at two alternative times, G3 and G4. Genetic diversity was characterized by allele frequency, inbreeding coefficients based on genomic and pedigree data, and expected heterozygosity. Breeding values were extracted at each generation to evaluate differences in starting selection at G3 or G4. All simulations were run for ten generations. A scenario with genomic data to manage diversity during the first generations of a new line was compared with a random and a selection scenario. As expected, loss of diversity was higher in the selection scenario, while the scenario with diversity control preserved diversity. We also combined the diversity management strategy with different selection scenarios involving different degrees of diversity control. Our simulation results show that a diversity management strategy combining genomic data with selection starting at G4 and a moderate degree of diversity control generates genetic progress and preserves diversity.
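
Expected heterozygosity, one of the diversity measures named in this abstract, can be illustrated with a short sketch. It assumes biallelic markers coded as allele counts (0, 1, 2) and uses the textbook definition 2p(1-p) averaged over markers; the abstract does not say which estimator the authors used, and the function name `expected_heterozygosity` is hypothetical.

```python
import numpy as np

def expected_heterozygosity(genotypes):
    """Mean expected heterozygosity over biallelic markers.

    genotypes: array of shape (n_animals, n_markers) holding counts
    of one allele (0, 1 or 2). The coding is an assumption made for
    this sketch, not something stated in the abstract.
    """
    g = np.asarray(genotypes, dtype=float)
    # Allele frequency per marker: mean allele count divided by 2
    p = g.mean(axis=0) / 2.0
    # Expected heterozygosity per marker is 2p(1-p); average over markers
    return float(np.mean(2.0 * p * (1.0 - p)))

# Toy data only: two animals, two markers
he = expected_heterozygosity([[0, 2], [2, 2]])
```

In the toy input the first marker has allele frequency 0.5 (expected heterozygosity 0.5) and the second is fixed (expected heterozygosity 0), so the mean is 0.25. Tracking this value per generation is one simple way to compare diversity loss between scenarios.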

10

PEC: a robust algorithm to reconcile pedigree and SNP-chip data on the basis of LD block, haplotype information, and Mendelian conflicts

Fu, C.; Mei, Q.; Miao, Y.; Xiang, T.

2026-06-04 genetics 10.64898/2026.06.01.729286 medRxiv

Top 0.1%

61.2%

Motivation: Pedigree errors frequently occur in livestock populations due to long-term manual record-keeping, which reduces the efficiency of breeding programs. Although several pedigree correction methods exist, their practical application is often limited by complicated procedures, high computational cost, and insufficient accuracy. Therefore, an effective and efficient solution for pedigree error correction is needed. Results: We developed a new algorithm and software, PEC, to accurately and efficiently correct pedigree errors. The method matches haplotype fragments between candidate parents and offspring using estimated linkage disequilibrium patterns and subsequently checks for Mendelian conflicts to adjust the pedigree. Using simulated pig datasets, we compared PEC against SeekParentF90 and AlphaAssign in terms of accuracy, memory usage, and computation time. PEC demonstrated superior performance across all metrics. Furthermore, application of single-step genomic best linear unbiased prediction (ssGBLUP) in a real pig population showed that PEC-corrected pedigrees significantly improved the accuracy and unbiasedness of genomic evaluations, highlighting the importance of pedigree error correction. Availability: The PEC software is freely available at https://github.com/TXiang-lab/JPEC.
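
The Mendelian-conflict check mentioned in this abstract can be illustrated with a generic sketch. This is not the PEC implementation, whose details the abstract does not give; it shows only the widely used opposing-homozygote test for a single parent-offspring pair. The genotype coding (0, 1, 2 allele counts, negative values for missing calls) and the function name `opposing_homozygote_rate` are assumptions made for this example.

```python
import numpy as np

def opposing_homozygote_rate(parent, offspring):
    """Fraction of jointly called SNPs where parent and offspring are
    opposite homozygotes (0 vs 2), which is incompatible with a true
    parent-offspring relationship barring genotyping error.

    Genotypes are allele counts 0/1/2; negative values mark missing
    calls (an assumed coding, not taken from the abstract).
    """
    parent = np.asarray(parent)
    offspring = np.asarray(offspring)
    # Only SNPs called in both animals can be checked
    called = (parent >= 0) & (offspring >= 0)
    # A difference of 2 in allele count means 0 vs 2 or 2 vs 0
    conflicts = called & (np.abs(parent - offspring) == 2)
    n_called = int(called.sum())
    return conflicts.sum() / n_called if n_called else float("nan")

# Toy data only: one conflict among four jointly called SNPs
rate = opposing_homozygote_rate([0, 2, 1, 0, -1], [2, 2, 1, 1, 1])
```

In practice a small threshold above zero is usually applied to tolerate genotyping errors before a recorded parent is rejected; the appropriate value depends on the SNP chip and is not stated here.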

11

Genetic Modeling of Dyadic Behavioral Traits: Implications for Estimation and Interpretation of Variance Components

Jiang, X.; Siegford, J.; Steibel, J. P.

2026-06-12 genetics 10.64898/2026.06.10.731434 medRxiv

Top 0.1%

60.1%

Studying the genomic control of dyadic social interactions is gaining traction in animal genetics. However, genetic modeling of social interactions poses several challenges, one of which is whether social interactions should be treated as dyadic traits or as aggregated traits at the individual level. In this study, we systematically compared two approaches: dyadic models using dyadic traits and marginal models using marginally aggregated traits and we derived the algebraic relationships between their variance components. In the application, we used a published dataset on post-mixing aggression in pigs, including both directed and undirected aggression records collected during the 9-hour period after mixing among 797 finishing pigs in 59 social groups, as an example to show how model choice can affect variance estimation. Results showed that dyadic models can estimate genetic effects and permanent environmental effects by exploiting repeated dyadic interaction records, thereby enabling a more complete understanding of the sources of variation underlying social interactions. In contrast, marginal models can bias the estimation and interpretation of genetic components, as the aggregated genetic variance may be confounded with other variance components due to the aggregation of dyadic traits. Marginal models may also lead to overestimation of social group and residual variance. These results can provide useful guidance for choosing appropriate modeling strategies for social interaction traits.

12

Genomic selection accuracy and bias using imputed genotypes on growth, welfare and fitness traits in two Pekin duck lines

Matika, O.; Tarsani, E. A.; McIntosh, K.; Desire, S. G.; Kebede, F. G.; Talenti, A. G.; Rae, A. M.; Kranis, A.; Watson, K. A.

2025-12-26 genomics 10.64898/2025.12.24.696349 medRxiv

Top 0.1%

59.8%

The current study investigated genomic selection accuracy and bias estimates in two commercial Pekin duck lines reared under commercial breeding practices. A large dataset of 26K duck records comprising both phenotype and imputed genotype information (60K chip) was analysed for growth, welfare and primary feather length traits. First, we employed mixed linear models with relationship matrices computed from the pedigree (BLUP) or markers (GBLUP) to estimate the variance components and breeding values. Then, we estimated the selection accuracies and selection biases to assess which models were more appropriate. Our results showed moderately high imputation accuracies of 0.93 and 0.92 for lines A and D respectively. In both lines, the heritability estimates obtained using the pedigree were generally higher than those obtained using genomic markers for all traits considered. Estimates ranged from 0.22±0.01 vs 0.25±0.01 (line A vs line D) for juvenile weight (JW) using marker information to 0.39±0.02 vs 0.50±0.02 (line A vs line D) for slaughter body weight (BW) using the pedigree. We observed very low estimates of heritability for gait (0.07±0.01) using markers in both lines. Breast muscle depth (BD) also had lower estimates of 0.15-0.16 using markers. For line A, the genomic predictions were generally higher when using the G-matrix than the A-matrix, with the highest predictions for BW (r2=0.68-0.70) and JW (r2=0.49). The estimates for gait and foot pad dermatitis (FPD) were greatly improved by using the G-matrix, at 0.58 vs 0.24 and 0.68 vs 0.44 respectively for markers vs pedigree information. For line D, the same improvements for G-matrix vs A-matrix were observed, with estimates for BD being similar in the two lines. However, for BD the G-matrix greatly improved the estimates from 0.50 to 0.71, unlike in line A where they remained at 0.50. The biases in line A were minimal (0.01-0.19) using the G-matrix compared to 0.02-0.41 when using the A-matrix. The highest observed bias was for JW followed by BD for the G-matrix, whereas when using the A-matrix we observed higher biases in many traits (JW, BW, BD and gait). The biases for line D were generally lower for the G-matrix (0.02-0.17 vs 0.00-0.19) than those observed in line A using markers, whereas higher biases were observed using the pedigree (0.01-0.37). Current findings indicated that all traits were heritable, with higher prediction accuracies and lower biases when using GBLUP as opposed to traditional BLUP. The present study demonstrates the effectiveness of GBLUP for improving prediction accuracy and reducing bias in selection traits of Pekin ducks, particularly for traits with low heritability. Author Summary: The study explored genomic selection in two commercial Pekin duck lines. Using a large dataset of 26,000 records, including phenotype and genotype data, researchers analyzed growth, welfare, and feather length traits. They applied statistical models to assess variance components and breeding values, comparing traditional pedigree-based methods (BLUP) with genomic marker-based methods (GBLUP). Results showed high imputation accuracies (93% for line A and 92% for line D). Heritability estimates varied, with genomic markers generally producing lower estimates than pedigrees, except for traits like gait and breast muscle depth where genomic predictions were superior. For example, line A showed higher accuracy using genomic data for body weight and juvenile weight. Overall, genomic predictions (GBLUP) provided higher accuracy and lower bias compared to traditional methods, especially for traits with low heritability. This highlights the effectiveness of GBLUP in improving selection processes in Pekin ducks.

13

Genetic parameters and genome-wide association analysis of service sire effect on litter size and its relationship with boar semen quality in three terminal sire lines

Chen, C.-Y.; Lourenco, D.; Kleve-Feld, M.; Bhatnagar, A.; Holl, J.

2025-08-06 genetics 10.1101/2025.08.01.668164 medRxiv

Top 0.1%

59.6%

This study aimed to estimate the genetic parameters for service sire effects on the number born alive (NBA) and its relationship with semen quality traits. Data of 6,416, 23,188, and 48,890 litter size records collected between 2020 and 2024 from three purebred terminal sire lines were analyzed. The number of sows was 3,071, 11,819, and 24,089, with 197, 554, and 891 service sires for the three respective lines. There were 1,424,858, 4,344,630, and 2,146,583 animals in the pedigree of which 67,990, 259,250, and 365,392 were genotyped and imputed up to 50K SNPs. The service sire and dam effects were modeled as additive genetic effects, considering a covariance structure among them, given by the relationship matrix. The model also included fixed effects of contemporary group and parity group with random permanent environmental effects for service sire and dam. Genetic parameters were estimated using the AIREML option in the BLUPF90+ software, and GEBVs were generated by ssGBLUP with the algorithm for proven and young (APY). Heritability for the service sire effect ranged from 0.01 to 0.03, and the heritability for the dam effect ranged from 0.09 to 0.15, with genetic correlations ranging from -0.20 to 0.37. A single-step genome-wide association study (ssGWAS) was also performed, and no strong signals were detected for service sire effects on NBA. Sperm motility (MOT) and total abnormal morphology (ABN_MOR) GEBVs from the three lines were estimated based on about 57,500 to 608,996 ejaculates recorded using the computer-assisted semen analysis (CASA) system from 1,698 to 29,095 boars collected from 2013 to 2025. Heritability estimates ranged from 0.10 to 0.12 for motility and from 0.17 to 0.28 for total abnormal morphology. The correlations between service sire GEBV for NBA and semen quality were low but in the favorable direction. 
Results suggest that although paternal genetic contributions to litter size were small compared to maternal genetic contributions, selecting on service sire effects on litter size in addition to semen quality traits will improve overall reproductive success.

14

Comparison of two multi-trait association testing methods and sequence-based fine mapping of six QTL in Swiss Large White pigs

Noskova, A.; Mehrotra, A.; Kadri, N. K.; Llores-Villas, A.; Neuenschwander, S.; Hofer, A.; Pausch, H.

2022-12-15 genomics 10.1101/2022.12.13.520268 medRxiv

Top 0.1%

59.6%

BackgroundGenetic correlations between complex traits suggest that pleiotropic variants contribute to trait variation. Genome-wide association studies (GWAS) aim to uncover the genetic underpinnings of traits. Multivariate association testing and the meta-analysis of summary statistics from single-trait GWAS enable detecting variants associated with multiple phenotypes. In this study, we used array-derived genotypes and phenotypes for 24 reproduction, production, and conformation traits to explore differences between the two methods and used imputed sequence variant genotypes to fine-map six quantitative trait loci (QTL). ResultsWe considered genotypes at 44,733 SNPs for 5,753 pigs from the Swiss Large White breed that had deregressed breeding values for 24 traits. Single-trait association analyses revealed eleven QTL that affected 15 traits. Multi-trait association testing and the meta-analysis of the single-trait GWAS revealed between 3 and 6 QTL, respectively, in three groups of traits. The multi-trait methods revealed three loci that were not detected in the single-trait GWAS. Four QTL that were identified in the single-trait GWAS, remained undetected in the multi-trait analyses. To pinpoint candidate causal variants for the QTL, we imputed the array-derived genotypes to the sequence level using a sequenced reference panel consisting of 421 pigs. This approach provided genotypes at 16 million imputed sequence variants with a mean accuracy of imputation of 0.94. The fine-mapping of six QTL with imputed sequence variant genotypes revealed four previously proposed causal mutations among the top variants. ConclusionsOur findings in a medium-size cohort of pigs suggest that multivariate association testing and the meta-analysis of summary statistics from single-trait GWAS provide very similar results. 
Although multi-trait association methods provide a useful overview of pleiotropic loci segregating in mapping populations, the investigation of single-trait association studies is still advised, as multi-trait methods may miss QTL that are uncovered in single-trait GWAS.
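
As a rough illustration of how summary statistics from single-trait GWAS can be combined for one SNP, the sketch below computes a two-trait chi-square statistic from signed t-values. The abstract does not say which meta-analysis statistic was used, so the quadratic form t'V⁻¹t (with V the correlation between the two traits' t-values), the function name `two_trait_meta_p`, and the example numbers are all assumptions made for illustration.

```python
import math

def two_trait_meta_p(t1, t2, r):
    """Combine the signed t-values of one SNP for two traits.

    r is the correlation between the two traits' t-values across SNPs
    (assumed known here). Returns (chi2, p) with 2 degrees of freedom.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # Quadratic form t' V^-1 t for V = [[1, r], [r, 1]].
    chi2 = (t1 * t1 - 2.0 * r * t1 * t2 + t2 * t2) / (1.0 - r * r)
    # The survival function of a chi-square with 2 df is exp(-x / 2).
    p = math.exp(-chi2 / 2.0)
    return chi2, p

# Hypothetical t-values for a single SNP and two traits.
chi2, p = two_trait_meta_p(t1=3.0, t2=-2.5, r=0.2)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

With more than two traits the same quadratic form needs a matrix inverse and a chi-square with as many degrees of freedom as traits; the two-trait case is used here only because it has a closed form.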

15

Trajectories of genetic correlations in populations under selection: from theory to a case-study

Cuyabano, B. C.; Motta, M. R.; Vandenplas, J.; Garcia, N. L.; Shokor, F.; Croiseau, P.; Boichard, D.; Aguerre, S.; Mattalia, S.

2025-03-15 genetics 10.1101/2025.03.13.643026 medRxiv

Top 0.1%

59.6%

Background: Breeding programs select for multiple commercial traits, aiming to achieve genetic progress for all. Often, selection is based on a selection index, i.e. a linear combination of traits with weights defined by, among other information, the genetic correlation between traits. These correlations are typically estimated as a static parameter and assumed to be equal across all individuals and generations. While research on the consequences of selection for genetic variances (Bulmer effect) is widely available, only a few studies have focused on the consequences of selection for genetic correlations. Our study extended the existing inferences about how selection affects genetic variances to how multi-trait selection affects genetic correlations. To further our understanding of genetic correlations, we also proposed an alternative method to calculate genetic correlations between traits at the individual level, which we call the individualized sire genetic correlation (iSGC), obtained through the estimated breeding values (EBV) of evaluated daughters. Lastly, a case-study was performed on thirty years of data from the French Holstein dairy cattle population, for five traits studied pairwise: milk and protein yield, milking speed, somatic cell score, and cow conception rate. Results: Theory revealed that multi-trait selection leads to an attenuation (decrease) of positive genetic correlations, with potential to revert them to negative values, if initially low. Uncorrelated traits will become negatively correlated, and negative genetic correlations will be either intensified or attenuated (decrease or increase, respectively), depending on selection intensity, weights applied to the selection index, and the initial genetic correlation. Conclusion: Both theory and empirical results on real data confirm that selection does change the genetic correlation between traits in a population under selection. 
Moreover, empirical trajectories of the iSGC were in better agreement with the theory than trajectories of populational genetic correlations. The iSGC searches for individual-specific patterns of correlations, and since it is measured on sires through the EBV of their daughters, it also considers the recombination of the genetic background. Along with the fact that trajectories of iSGC were in better agreement with theory, we believe it to be a potentially less biased measure of genetic correlations between traits.
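
The direction of the changes described in the Results can be reproduced with a textbook calculation. The sketch below is not the authors' derivation: it assumes two traits with unit genetic variance, bivariate normality, and truncation selection on an index of true breeding values, and it returns the correlation among the selected parents only. The function name and default values are hypothetical.

```python
from statistics import NormalDist

def correlation_after_selection(r0, w1=1.0, w2=1.0, selected_fraction=0.2):
    """Genetic correlation among selected parents for two unit-variance traits.

    Assumes truncation selection on the index I = w1*g1 + w2*g2 of true
    breeding values and bivariate normality (illustrative assumptions).
    """
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1.0 - selected_fraction)  # truncation point
    i = std_normal.pdf(x) / selected_fraction        # selection intensity
    k = i * (i - x)                                  # fraction of index variance removed
    # Covariance of each trait with the index, and the index variance w'Gw.
    c1 = w1 + w2 * r0
    c2 = w1 * r0 + w2
    var_index = w1 * c1 + w2 * c2
    # Selected-group covariance matrix: G* = G - k * (Gw)(Gw)' / (w'Gw).
    v1 = 1.0 - k * c1 * c1 / var_index
    v2 = 1.0 - k * c2 * c2 / var_index
    c12 = r0 - k * c1 * c2 / var_index
    return c12 / (v1 * v2) ** 0.5

for r0 in (0.5, 0.0, -0.5):
    print(r0, round(correlation_after_selection(r0), 3))
```

Under these assumptions and equal index weights, a positive starting correlation is reduced, a zero correlation becomes negative, and a negative one becomes more negative, which matches the qualitative pattern reported in the abstract. Other weights or selection intensities can give different outcomes for negative starting values.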

16

Genome-wide association analyses of multiple traits in Duroc pigs using low-coverage whole-genome sequencing strategy

Yang, R.; Guo, X.; Zhu, D.; Bian, C.; Zhao, Y.; Wang, Y.; Hu, X.; Li, N.

2019-09-05 genetics 10.1101/754671 medRxiv

Top 0.1%

59.0%

High-density markers discovered in large samples are essential for mapping complex traits at gene-level resolution in agricultural livestock and crops. However, the unavailability of large reference panels and array designs for a target population of agricultural species limits the improvement of array-based genotype imputation. Recent studies showed that very low coverage sequencing (LCS) of a large number of individuals is a cost-effective approach to discover variation in much greater detail in association studies. Here, we performed cohort-wide whole-genome sequencing at an average depth of 0.73x and identified more than 11.3 M SNPs. We also evaluated the data set and performed genome-wide association analysis (GWAS) in 2885 Duroc boars. We compared two different pipelines, selected a suitable method (BaseVar/STITCH) for LCS analyses, and determined that sequencing 1000 individuals at 0.2x depth is enough to identify SNPs with high accuracy in this population. Of the seven association signals derived from the genome-wide association analysis of the LCS variants, which were associated with four economic traits, we found that two QTLs with narrow intervals were possibly responsible for the teat number and back fat thickness traits, and we identified 7 missense variants in a single sequencing step. This strategy (BaseVar/STITCH) is generally applicable to any population or species that has no suitable reference panel. These findings show that the LCS strategy is a suitable approach for constructing new genetic resources to facilitate genome-wide association studies, fine mapping of QTLs, and genomic selection, and imply that it can be widely used for agricultural animal breeding in the future.

17

Marker effect p-values for single-step SNP-BLUP genomic models

Adekale, D. J.; Liu, Z.; Alkhoder, H.; Segelke, D.; Thaller, G.; Tetens, J.

2025-11-01 genomics 10.1101/2025.10.30.685527 medRxiv

Top 0.1%

57.2%

Background: Single-step Single Nucleotide Polymorphism best linear unbiased prediction (ssSNPBLUP) is a comprehensive method for obtaining genomically enhanced breeding values for animals and SNP effects in a single evaluation. The ssSNPBLUP model integrates phenotypic, pedigree, and genomic data for genomic evaluations. However, there has been no framework for estimating the reliability and p-values of the SNP effects obtained from a ssSNPBLUP genomic model. This study investigates the reliability and significance of the SNP effects estimated using a ssSNPBLUP framework in German Limousin (LIM) and Holstein (HOL) cattle populations. Methods: This study introduces a novel approach for calculating p-values within the ssSNPBLUP framework and compares it to a conventional single-marker regression GWAS approach. SNP reliabilities were computed using prediction error variances of SNP effect estimates, enabling the identification of statistically significant SNP markers. LIM data included weaning weight (200-DW) evaluated with a maternal effect BLUP model, while HOL data comprised production traits (milk yield, protein yield, fat yield, and somatic cell score) analysed via a random regression test-day model. Results: The results reveal significant SNP effects in both LIM and HOL evaluations, with notable differences attributed to the size of the reference populations. Average SNP reliabilities were higher in HOL (Mean SNP reliability: 0.42) compared to LIM (Mean SNP reliability: 0.02), underscoring the critical role of the size of the reference population in determining the accuracy and reliability of SNP effects obtained from genomic evaluations. Conclusions: The calculation of p-values from the ssSNPBLUP framework offers an efficient approach to identify quantitative trait loci (QTL) that significantly influence traits in populations. 
Our approach provides a framework that could be implemented in large and complex datasets such as those used in many national routine evaluations, where only a proportion of the animals are genotyped.
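
To make the link between prediction error variance (PEV), reliability, and p-values concrete, here is a minimal sketch. It uses the generic relations for a BLUP of a random effect (reliability = 1 - PEV / prior variance, and Var(estimate) = prior variance - PEV) with a normal approximation. The abstract does not give the authors' exact formulas, so these relations, the function names, and the example numbers are assumptions.

```python
import math

def snp_reliability(pev, sigma2_snp):
    """Reliability of an estimated SNP effect: 1 - PEV / prior SNP variance."""
    return 1.0 - pev / sigma2_snp

def snp_p_value(effect, pev, sigma2_snp):
    """Two-sided p-value for an estimated SNP effect.

    Uses Var(a_hat) = sigma2_snp - PEV for a BLUP of a random effect and a
    normal approximation; both are assumptions of this sketch.
    """
    var_hat = sigma2_snp - pev
    if var_hat <= 0.0:
        raise ValueError("PEV must be smaller than the SNP variance")
    z = effect / math.sqrt(var_hat)
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical values on an arbitrary scale.
print(snp_reliability(pev=0.58, sigma2_snp=1.0))
print(snp_p_value(effect=0.98, pev=0.75, sigma2_snp=1.0))
```

In this formulation a SNP with low reliability has a PEV close to its prior variance, so the variance of its estimate is small and the estimate itself is shrunk towards zero.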

18

Impact of genomic selection on genetic diversity in five local European cattle breeds

Bonifazi, R.; Meuwissen, T. H.; Croiseau, P.; Restoux, G.; Minery, S.; Windig, J.

2025-09-08 genetics 10.1101/2025.09.08.674844 medRxiv

Top 0.1%

56.6%

Genomic selection (GS) has revolutionised animal breeding and accelerated genetic gains in breeding programs. While GS has become common in major dairy cattle breeds, its implementation in local breeds has begun only more recently or is still in progress. However, the introduction of GS in some major breeds has also been associated with increased inbreeding rates, raising concerns about the potential effects of GS on the genetic diversity in smaller or local breeds. Our aim was to investigate the impact of GS on genetic diversity in five (small) local cattle breeds from three European countries. The five breeds evaluated were: MRY (from the Netherlands), Norwegian Red (from Norway), Abondance, Tarentaise, and Vosgienne (from France). We investigated changes in population demographic structure, as well as trends and rates of kinship and inbreeding, using both pedigree- and genomic-based measures. The population size varied depending on the breed, with Vosgienne being the smallest and Norwegian Red being the largest. The dataset included 4,645 MRY, 193,489 Norwegian Red, 16,427 Abondance, 8,882 Tarentaise, and 4,466 Vosgienne genotyped animals for more than 40,000 single-nucleotide polymorphisms. Overall, following the implementation of GS in these breeds, we observed a reduction in generation intervals for sires, fewer calves that later became sires, and, for the French breeds, a broader sire usage. Such changes were likely due to GS enabling the preselection and screening of more young bulls. Additionally, we observed a more balanced contribution of the top ten sires after the introduction of GS. Although changes in inbreeding and kinship rates occurred after the introduction of GS, there was no consistent pattern across breeds: rates increased in MRY and Tarentaise, but decreased in Norwegian Red, Abondance, and Vosgienne. 
Our study suggests that changes and increases in inbreeding rates may occur after the introduction of GS, although they may not be directly due to the introduction of GS per se, but rather due to population management strategies, such as optimal contribution selection or other breeding practices implemented at the nucleus level. Our findings emphasise the importance of monitoring changes in both genetic diversity and population demographic structure after implementing GS in local breeds, as well as adjusting breeding strategies when needed to ensure long-term sustainability. Interpretive Summary: Genomic selection (GS) has transformed cattle breeding. We investigated changes in population demographic structure and genetic diversity in five European breeds after the introduction of GS. Changes in inbreeding and kinship rates were not consistent across breeds, with both increases and decreases observed. Genetic management strategies, such as optimal contribution selection, had a greater impact on maintaining genetic diversity than the introduction of GS per se. These findings highlight the need to monitor changes in genetic diversity and population demographic structure after the implementation of GS and, when needed, to adapt management strategies to ensure long-term sustainability.
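
For readers unfamiliar with how a rate of inbreeding is expressed, the sketch below uses the classical per-generation definition and the effective population size it implies. The abstract does not state how rates were computed in this study, so the formulas, function names, and numbers are illustrative assumptions only.

```python
def inbreeding_rate(f_prev, f_curr):
    """Rate of inbreeding per generation: (F_t - F_{t-1}) / (1 - F_{t-1})."""
    return (f_curr - f_prev) / (1.0 - f_prev)

def effective_population_size(delta_f):
    """Effective population size implied by a positive rate: Ne = 1 / (2 * dF)."""
    if delta_f <= 0.0:
        raise ValueError("Ne is only defined here for a positive rate")
    return 1.0 / (2.0 * delta_f)

# Hypothetical mean inbreeding coefficients in two consecutive generations.
delta_f = inbreeding_rate(f_prev=0.02, f_curr=0.03)
print(delta_f, effective_population_size(delta_f))
```

The same expressions apply to kinship by replacing the mean inbreeding coefficients with mean kinship coefficients.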

19

Integrating temporal genomic and transcriptomic analyses to decipher genetic basis of feed intake in dairy cattle

James, C.; Fang, L.; Wu, Z.; Hope, J.; Coffey, M.; Li, B.

2026-02-09 genomics 10.64898/2026.02.07.704532 medRxiv

Top 0.1%

52.7%

Background: Food intake is a complex trait in living organisms, and its genetics have been widely studied in humans, mice, Drosophila, cattle, pigs, chicken, and fish. In dairy cattle, feed intake is closely linked to an individual's energy balance, health, production, efficiency, and environmental footprint. Recent studies have provided solid evidence of genetic variation in feed intake (FI) in dairy cattle populations, but the genetic basis and molecular mechanism of dairy feed intake are still far from clear, especially considering the lactation cycles of dairy cattle. This study aims to integrate stage-dependent genome-wide association (GWA) analyses, regional heritability mapping (RHM), and RNA-seq gene expression analyses to identify temporal functional variants associated with cattle dry matter intake (DMI) across multiple stages in lactation cycles. A total of 750,000 daily DMI records from 7,500 lactations of 2,300 cows were available with the animals' genotype and pedigree information. Total RNA-seq data from blood were generated for 121 individuals in this population from 2 lactation stages. Data were split into multiple lactation stages for GWA, RHM, and transcriptomic analyses. Results: Stage-dependent GWAS and RHM identified 21 significant loci associated with DMI across multiple lactation stages. A total of 45 candidate genes were identified from GWA and RHM. Among these 45 genes, six were later found to be significantly differentially expressed between high and low feed intake animal groups using gene expression information from RNA-seq data. These genes show links to sugar and adipose metabolism, milk production, body weight, dopamine-reward pathways and immune functions. Conclusions: Our multi-omics analyses provide molecular evidence that the genetic basis of cattle DMI across lactation is not static. 
Temporal genomic variants associated with FI were identified and their transcriptomic patterns investigated, decoding the molecular mechanisms underlying DMI. Overall, the associated variants and candidate genes uncovered herein decoded the genetic architecture of dairy feed intake on a temporal and multi-omics basis, enhancing the understanding of the basic biology of dairy feed intake and informing breeding strategies aimed at improving dairy feed efficiency.

20

Correlation scan: identifying genomic regions that affect genetic correlations applied to fertility traits

Olasege, B. S.; Porto-Neto, L. R.; Tahir, M. S.; Gouveia, G. C.; Canovas, A.; Hayes, B.; Fortes, M.

2021-11-05 genetics 10.1101/2021.11.05.467409 medRxiv

Top 0.1%

52.3%

Reproductive traits are often genetically correlated. Yet, we don't fully understand the complexities, synergism, or trade-offs between male and female fertility. Here, we introduce correlation scan, a novel framework for identifying the drivers or antagonizers of the genetic correlation between male and female fertility traits across the bovine genome. The identification of these regions facilitates the understanding of the complexity of these traits. Although the methodology was applied to cattle phenotypes, using high-density SNP genotypes, the general framework developed can be applied to any species or traits, and it can easily accommodate genome sequence data.